56 research outputs found

    Identification of vortex in unstructured mesh with graph neural networks

    Deep learning has been employed to identify flow characteristics from Computational Fluid Dynamics (CFD) databases, helping researchers better understand the flow field, optimize the geometry design, and select the correct CFD configuration for a given flow characteristic. The Convolutional Neural Network (CNN) is one of the most popular algorithms used to extract and identify flow features. However, without additional flow-field interpolation, its use is limited to simple domain geometries and regular meshes, which restricts its application to real industrial cases, where complex geometries and irregular meshes are common. To address these problems, we present a Graph Neural Network (GNN) based model with a U-Net architecture to identify vortices in CFD results on unstructured meshes. We introduce graph generation from CFD meshes and graph hierarchy construction using an algebraic multigrid method. A vortex auto-labeling method is proposed to label vortex regions in 2D CFD meshes. We refine our approach by first optimizing the input set on CNNs, then benchmarking current GNN kernels against a CNN model and evaluating the GNN kernels in terms of classification accuracy, training efficiency, and identified vortex morphology. Finally, we demonstrate the adaptability of our approach to unstructured meshes and its generality to unseen cases with different turbulence models at different Reynolds numbers.
    Comment: Accepted by the journal Computers & Fluids
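    To make the graph-based setting concrete, here is a minimal, illustrative sketch of one message-passing layer over a mesh-derived graph, where each CFD cell becomes a node and each shared face an edge. The adjacency, feature sizes, and weights below are all hypothetical, not the paper's actual model.

```python
import numpy as np

# Hypothetical cell-adjacency of a tiny unstructured mesh:
# each CFD cell is a graph node, each shared face an edge.
adjacency = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # per-cell features (e.g. velocity, pressure)
W_self = rng.normal(size=(8, 8))
W_neigh = rng.normal(size=(8, 8))

def message_passing(h):
    """One GraphSAGE-style layer: mean-aggregate neighbour features,
    combine with the node's own features, apply a ReLU."""
    out = np.empty_like(h)
    for node, neighbours in adjacency.items():
        agg = h[neighbours].mean(axis=0)
        out[node] = np.maximum(h[node] @ W_self + agg @ W_neigh, 0.0)
    return out

h1 = message_passing(features)
print(h1.shape)  # (4, 8)
```

    In an actual U-Net-style GNN, such layers would alternate with pooling over the algebraic-multigrid coarse graphs and a corresponding unpooling on the way back up.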

    Experimenting task-based runtimes on a legacy Computational Fluid Dynamics code with unstructured meshes

    Advances in high-performance computing hardware lead to higher levels of parallelism and optimization in scientific applications, and more specifically in computational fluid dynamics codes. To reduce the complexity that such architectures bring, while harnessing an acceptable share of the parallelism offered by modern clusters, the task-based approach has recently gained a lot of popularity, as it is expected to deliver portability and performance with a relatively simple programming model. In this paper, we review and present the process of adapting part of Code_Saturne, our legacy code at EDF R&D, into a task-based form using the PaRSEC (Parallel Runtime Scheduling and Execution Control) framework. We first show the adaptation of our prime algorithm to a simpler form, to remove part of the complexity of our code, and then present its task-based implementation. We compare the performance of various forms of our code and discuss the benefits of task-based runtimes in terms of scalability, ease of incremental deployment in a legacy CFD code, and maintainability.

    Experiments with multi-level parallelism runtimes on a CFD code with unstructured meshes

    Large applications for parallel computers, and more specifically unstructured Computational Fluid Dynamics codes, are often based on the bulk-synchronous parallelism (BSP) approach and therefore mostly exploit parallelism using runtime systems such as the Message Passing Interface (MPI), which has enabled strong, relatively portable, and quite durable performance for these codes for many years. However, as MPI was developed for distributed computing, we do not expect its standalone use to be the best fit for recent many-core architectures. Indeed, the ever-growing performance of high-performance machines comes mostly from the aggregation of different computing devices into heterogeneous machines: combinations of traditional computing units (CPUs) with accelerators, namely many-core architectures such as GPGPUs or Intel Xeon Phi accelerators. This hybridization of HPC clusters requires scientific code developers to master more and more techniques and programming models in order to extract the full potential of their machines. As the post-petascale era has long been foreseen, runtime system developers have been investigating other parallelism paradigms. Notably, the task-based approach has recently gained a lot of popularity, as it is expected to deliver portability and performance with a relatively simple programming model. Tasks can be either local or distant, so a single model can handle both inter-node and intra-node aspects. In addition, computation-communication overlap is straightforward. As opposed to coarse-grain parallelism, performance depends strongly on choosing a good data grain size for each task, which should require performance measurement and tuning but no additional programming effort.
As Code_Saturne [1], our CFD code at EDF R&D, is based on unstructured meshes, with a significant part of its code being memory-bound, refined parallelism through MPI + X solutions such as MPI + OpenMP often fails to deliver significant (or any) performance improvements, though it does reduce the memory consumption per thread. Using a simple "loop-local" OpenMP model, performance drops rapidly as we increase the number of threads per MPI rank, since many secondary loops are not threaded; moreover, avoiding data races often requires specific renumbering strategies, which may not be easily adapted everywhere with a reasonable programming effort. These diminishing returns limit the effort worth spending on top of the base MPI model. The current evolution of HPC technologies may thus be seen as unfavorable to an unstructured CFD code like Code_Saturne in its current form. This is why we decided to investigate recent HPC techniques and runtime systems in search of a sustainable, easy-to-propagate, portable, and efficient solution to bring better performance and adaptability to Code_Saturne. As many teams are already dedicating their work to new solvers and dense linear algebra, we decided to focus on another part of the puzzle, namely our gradient reconstruction computation. Representing a significant portion of our main numerical schemes, it has a high impact on the performance of our code and an intermediate computational intensity. We therefore propose in this article a review of different implementations of our gradient computation, moving towards a task-based approach through the use of task-based HPC runtime systems, and more specifically the PaRSEC [2] framework. The Parallel Runtime Scheduling and Execution Control (PaRSEC) framework implements a task-based, dataflow-driven programming model aimed at offering high performance while relieving developers of the hardware complexity of supercomputers.
We show that our first implementation offers comparable performance while increasing the arithmetic intensity of its computation (see figure 1). Moreover, by removing some data dependencies, our cell-based approach paves the way for finer-grained parallelism. We then push this approach towards our prime objective, task-based approaches, and implement our gradient computation using the PaRSEC runtime. Finally, we propose some insights on the use of task-based runtime systems in a legacy CFD code.
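The cell-based decomposition described above can be illustrated with a small, hypothetical sketch: each cell's gradient reconstruction reads neighbour values but writes only its own result, so cells can be dispatched as independent tasks. A plain thread pool stands in for the runtime here, not PaRSEC itself, and the mesh and field are toy data.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Toy "mesh": cell centres along a line and a linear field phi = 2x,
# whose exact gradient is 2 everywhere.
centres = np.array([[0.0], [1.0], [2.0], [3.0]])
phi = 2.0 * centres[:, 0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def cell_gradient(c):
    """Least-squares gradient reconstruction for one cell.
    Reads only neighbour values, writes only its own result,
    so every cell is an independent task."""
    dx = centres[neighbours[c]] - centres[c]
    dphi = phi[neighbours[c]] - phi[c]
    g, *_ = np.linalg.lstsq(dx, dphi, rcond=None)
    return g[0]

with ThreadPoolExecutor() as pool:
    grads = list(pool.map(cell_gradient, neighbours))
print(grads)  # each value ≈ 2.0, the exact slope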

    In Situ Statistical Analysis for Parametric Studies

    In situ processing aims to reduce storage needs and I/O traffic by processing the results of parallel simulations as soon as they are available in the memory of the compute processes. We focus here on computing in situ statistics over the results of N simulations from a parametric study. The classical approach consists in running various instances of the same simulation with different values of the input parameters; results are then saved to disk and statistics are computed post mortem, leading to very I/O-intensive applications. Our solution is Melissa, an in situ library running on staging nodes as a parallel server. On startup, simulations connect to Melissa and send the results of each time step as soon as they are available. Melissa implements iterative versions of classical statistical operations, enabling results to be updated as soon as a new time step from a simulation arrives. Once all statistics are updated, the time step can be discarded. We also discuss two different approaches for scheduling simulation runs: the jobs-in-job and the multi-jobs approaches. Our experiments run instances of the open-source Computational Fluid Dynamics solver Code_Saturne. They confirm that our approach avoids storing simulation results to disk or in memory.
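    The iterative statistics this approach relies on can be sketched with Welford's classical one-pass algorithm for the mean and variance, which updates its state from each incoming value so the value itself can then be discarded. This is an illustrative sketch, not Melissa's actual implementation.

```python
class RunningStats:
    """Welford's one-pass update: mean and variance are refreshed as
    each new time-step result arrives, so raw data can be discarded."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Unbiased sample variance.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
print(stats.mean, stats.variance)  # 5.0 and 32/7 ≈ 4.571
```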

    Unlocking Large Scale Uncertainty Quantification with In Transit Iterative Statistics

    Multi-run numerical simulations on supercomputers are increasingly used by physicists and engineers to deal with input data and model uncertainties. Most of the time, the input parameters of a simulation are modeled as random variables, and simulations are then run a (possibly large) number of times with input parameters varied according to a specific design of experiments. Uncertainty quantification for numerical simulations is a hard computational problem, currently bounded by the large size of the produced results. This book chapter is about using in situ techniques to enable large-scale uncertainty quantification studies. We provide a comprehensive description of Melissa, a file-avoiding, adaptive, fault-tolerant, and elastic framework that computes statistical quantities of interest in transit. Melissa currently implements the on-the-fly computation of the statistics necessary for large-scale uncertainty quantification studies: moment-based statistics (mean, standard deviation, higher orders), quantiles, Sobol' indices, and threshold exceedance.
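    Among the quantities listed, quantiles also admit one-pass estimation. A minimal sketch using Robbins-Monro stochastic approximation (an illustrative choice, not necessarily the algorithm Melissa uses) nudges the estimate toward the target quantile as each sample streams by:

```python
import random

def streaming_quantile(samples, alpha, c=1.0):
    """Robbins-Monro approximation of the alpha-quantile: move the
    estimate up when a sample lies above it, down otherwise, with a
    step size decaying as c/n. One pass, O(1) memory."""
    q = 0.0
    for n, x in enumerate(samples, start=1):
        q += (c / n) * (alpha - (1.0 if x <= q else 0.0))
    return q

random.seed(42)
samples = (random.random() for _ in range(200_000))
q90 = streaming_quantile(samples, alpha=0.9)
print(round(q90, 2))  # close to 0.9 for Uniform(0, 1) inputs
```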

    High-Performance Computing: Dos and Don’ts

    Computational fluid dynamics (CFD) is the main field of computational mechanics that has historically benefited from advances in high-performance computing. High-performance computing involves several techniques to make a simulation efficient and fast, such as distributed-memory parallelism, shared-memory parallelism, vectorization, and memory access optimizations. As an introduction, we present the anatomy of supercomputers, with special emphasis on HPC aspects relevant to CFD. Then, we develop some of the HPC concepts and numerical techniques applied to the complete CFD simulation framework: from preprocessing (meshing) to postprocessing (visualization), through the simulation itself (assembly and iterative solvers).
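    As a small illustration of one of the techniques mentioned, vectorization replaces an element-by-element loop with a single array expression that can map onto SIMD units or tuned kernels. The sketch below contrasts the two forms on an axpy-style kernel (toy sizes, NumPy standing in for compiled code):

```python
import numpy as np

a, n = 2.0, 10
x = np.arange(n, dtype=np.float64)
y = np.ones(n)

# Scalar loop: one element per iteration, the pattern a compiler
# must auto-vectorize in C/Fortran.
y_loop = y.copy()
for i in range(n):
    y_loop[i] = a * x[i] + y_loop[i]

# Vectorized form: a single array expression, no element-wise loop
# at the source level.
y_vec = a * x + y

assert np.array_equal(y_loop, y_vec)
print(y_vec[:3])  # [1. 3. 5.]
```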

    Occupational Factors and Socioeconomic Differences in Breast Cancer Risk and Stage at Diagnosis in Swiss Working Women.

    Socioeconomic differences in breast cancer (BC) incidence are driven by differences in lifestyle, healthcare use, and occupational exposure. Women of high socioeconomic status (SES) have a higher risk of BC, which is diagnosed at an earlier stage than in women of low SES. As the respective effects of occupation and SES remain unclear, we examined the relationships between occupation-related variables and BC incidence and stage while accounting for SES. Female residents of western Switzerland aged 18-65 years in the 1990 or 2000 census, with known occupation, were linked with the records of five cancer registries to identify all primary invasive BC diagnosed between 1990 and 2014 in this region. Standardized incidence ratios (SIRs) were computed by occupation using general female population incidence rates, with correction for multiple comparisons. Associations between occupational factors and BC incidence and stage at diagnosis were analysed with negative binomial and multinomial logistic regression models, respectively. The cohort included 381,873 women-years and 8818 malignant BC cases, with a mean follow-up of 14.7 years. Compared with the reference, three occupational groups predominantly associated with high socioprofessional status had SIRs > 1: legal professionals (SIR = 1.68, 95%CI: 1.27-2.23), social science workers (SIR = 1.29, 95%CI: 1.12-1.49), and some office workers (SIR = 1.14, 95%CI: 1.09-1.20). Conversely, building caretakers and cleaners had a reduced incidence of BC (SIR = 0.69, 95%CI: 0.59-0.81). Gradients in BC risk with skill and socioprofessional levels persisted when accounting for SES. A higher incidence was generally associated with a higher probability of early-stage BC. Occupation and SES may both contribute to differences in risk and stage at diagnosis of BC.

    Estimating 10-year risk of lung and breast cancer by occupation in Switzerland.

    INTRODUCTION Lung and breast cancer are important in the working-age population both in terms of incidence and costs. The aims of this study were to estimate the 10-year risk of lung and breast cancer by occupation and smoking status and to create easy-to-use age- and sex-specific 10-year risk charts. METHODS New lung and breast cancer cases diagnosed between 2010 and 2014 in all five cancer registries of western Switzerland, matched with the Swiss National Cohort, were used. The 10-year risks of lung and breast cancer by occupational category were estimated. For lung cancer, estimates were additionally stratified by smoking status using data on smoking prevalence from the 2007 Swiss Health Survey. RESULTS The risks of lung and breast cancer increased with age and were highest for current smokers. Men in elementary professions had a higher 10-year risk of developing lung cancer than men in intermediate and managerial professions. Women in intermediate professions had a higher 10-year risk of developing lung cancer than women in elementary and managerial professions. However, women in managerial professions had the highest risk of developing breast cancer. DISCUSSION The 10-year risk of lung and breast cancer differs substantially between occupational categories. Smoking produces greater changes in 10-year risk than occupation for both sexes. The 10-year risk is of interest to both patients and professionals to inform choices related to cancer risk, such as screening and health behaviors. The risk charts can also be used as public health indicators and to inform policies to protect workers.

    Massively parallel numerical simulation using up to 36,000 CPU cores of an industrial-scale polydispersed reactive pressurized fluidized bed with a mesh of one billion cells

    For the last 30 years, experimental and modeling studies have been carried out on fluidized-bed reactors from laboratory up to industrial scales. The application of the developed models to predictive simulations has, however, been strongly limited by the available computational power and by the capability of computational fluid dynamics software to handle large enough simulations. In recent years, both aspects have made significant advances, and we now demonstrate the feasibility of a massively parallel simulation, on whole supercomputers using NEPTUNE_CFD, of an industrial-scale polydispersed fluidized-bed reactor. This simulation of an olefin polymerization reactor uses an unsteady Eulerian multi-fluid approach and relies on a one-billion-cell mesh. This is a world first, as the accuracy obtained is yet unmatched for such a large-scale system. The interest of this work is twofold. In terms of High-Performance Computing (HPC), all the steps of setting up the simulation, running it with NEPTUNE_CFD, and post-processing the results induce multiple challenges due to the scale of the simulation. The simulation ran on 1260 to 36,000 cores on supercomputers, used 15 million CPU hours, and generated 200 TB of raw data for a simulated physical time of 25 s. This article details the methodology applied to handle this simulation, and also focuses on computational performance in terms of profiling, code efficiency, and the suitability of the partitioning method. Though interesting in itself, the HPC challenge is not the only goal of this work, as performing this highly resolved simulation will benefit the chemical engineering and CFD communities. Indeed, this computation makes it possible to account, in a realistic way, for complex flows in an industrial-scale reactor. The predicted behavior is described, and the results are post-processed to develop sub-grid models. These will allow lower-cost simulations with coarser meshes while still encompassing local phenomena.

    Broadcast-enabled massive multicore architectures: a wireless RF approach

    Broadcast has traditionally been regarded as a prohibitive communication transaction in multiprocessor environments. Nowadays, such a constraint largely drives the design of architectures and algorithms all-pervasive in diverse computing domains, directly and indirectly leading to diminishing performance returns as the many-core era approaches. Novel interconnect technologies could help revert this trend by offering, among others, improved broadcast support, even in large-scale chip multiprocessors. This article outlines the prospects of wireless on-chip communication technologies, pointing toward low-latency (a few cycles) and energy-efficient (a few picojoules per bit) broadcast. It also discusses the challenges and potential impact of adopting these technologies as key enablers of unconventional hardware architectures and algorithmic approaches, on the pathway to significantly improving the performance, energy efficiency, scalability, and programmability of many-core chips.